
fix capi-kubeadmconfig rule for hybrid providers #1494

Merged: 2 commits into main, Feb 10, 2025

Conversation

hervenicol
Contributor

Towards https://github.com/giantswarm/giantswarm/issues/32587

This PR fixes multi-provider routing for the KubeadmConfigNotReady alert.

Checklist

@hervenicol hervenicol requested a review from a team February 7, 2025 18:33
@hervenicol hervenicol self-assigned this Feb 7, 2025
@hervenicol hervenicol requested a review from a team as a code owner February 7, 2025 18:33
- expr: capi_kubeadmconfig_status_condition{type="Ready", status="False"} > 0
+ expr: |-
+   (
+     app_operator_app_info{status="not-installed", catalog=~"giantswarm|cluster|default", team!~"^$|noteam"}
Contributor

@QuentinBisson QuentinBisson Feb 7, 2025

What does that have to do with the app operator? I think the original metric should be capi_kubeadmconfig_status_condition{type="Ready", ...}, right?

Contributor Author

Welp, what did I do!? I screwed up the query with a broken copy/paste, and there were no unit tests to catch it! 😱
Thanks for catching it!
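For the record, a unit test for this rule could be sketched with `promtool test rules`. Everything here is an assumption based on this PR, not the repo's actual layout: the rule file name, the alert name `KubeadmConfigNotReady`, the evaluation timing, and the expected label set (a real rule likely adds labels such as severity or team that `exp_labels` would need to list too):

```yaml
# Hypothetical promtool unit test; file and alert names are assumptions.
rule_files:
  - capi-kubeadmconfig.rules.yml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      # A KubeadmConfig stuck in Ready=False on a vSphere cluster.
      - series: 'capi_kubeadmconfig_status_condition{type="Ready", status="False", cluster_id="test01"}'
        values: '1x20'
      - series: 'capi_cluster_info{cluster_id="test01", infrastructure_reference_kind="VSphereCluster"}'
        values: '1x20'
    alert_rule_test:
      - eval_time: 20m
        alertname: KubeadmConfigNotReady
        exp_alerts:
          - exp_labels:
              cluster_id: test01
              provider: vsphere
```

Run with `promtool test rules <file>`; a regression like the mis-pasted `app_operator_app_info` query would then fail the test instead of slipping through review.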

Contributor Author

I really wonder how I did it, because my Grafana Explore panel was actually using the right query 🤔
[screenshot: Grafana Explore panel showing the correct query]

Anyway, hopefully I've now pushed the proper one 🤞

Contributor

@QuentinBisson QuentinBisson left a comment

LGTM

@hervenicol hervenicol merged commit fdf2914 into main Feb 10, 2025
7 checks passed
@hervenicol hervenicol deleted the fix-kubeadmconfig-multiprovider branch February 10, 2025 08:59
Comment on lines +13 to +22
expr: |-
  (
    capi_kubeadmconfig_status_condition{type="Ready", status="False"}
    * on(cluster_id) group_left(provider)
    sum(
      label_replace(
        capi_cluster_info, "provider", "vsphere", "infrastructure_reference_kind", "VSphereCluster"
      )
    ) by (cluster_id, provider)
  ) > 0
Member

I'm not an expert in PromQL, so I may be misunderstanding the query, but aren't we basically disabling the alert for non-vSphere clusters? We want this alert for all providers.

Contributor

Not at all :)

What we are doing here is replacing the provider label with vsphere when the cluster type is a VSphereCluster.
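To make the mechanics concrete, here is a minimal Python sketch (not Giant Swarm code; the series, the hypothetical CAPA cluster, and the assumption that `capi_cluster_info` already carries a `provider` label on non-vSphere clusters are all illustrative) of what `label_replace` plus the `* on(cluster_id) group_left(provider)` join do:

```python
import re

def label_replace(series, dst, replacement, src, regex):
    """PromQL label_replace: set labels[dst] = replacement on every sample
    whose labels[src] fully matches regex; other samples pass through."""
    out = []
    for labels, value in series:
        labels = dict(labels)
        if re.fullmatch(regex, labels.get(src, "")):
            labels[dst] = replacement
        out.append((labels, value))
    return out

def multiply_group_left(left, right, on, extra):
    """Emulate `left * on(<on>) group_left(<extra>) right`: keep each
    left-hand sample with a match on the `on` label, copy the right-hand
    `extra` label onto it, and multiply the values."""
    index = {labels[on]: (labels, value) for labels, value in right}
    out = []
    for labels, value in left:
        match = index.get(labels.get(on))
        if match is not None:
            rlabels, rvalue = match
            merged = dict(labels, **{extra: rlabels.get(extra, "")})
            out.append((merged, value * rvalue))
    return out

# One vSphere cluster and one hypothetical CAPA cluster, both with a
# KubeadmConfig stuck in Ready=False.
kubeadmconfig = [
    ({"cluster_id": "vs01", "type": "Ready", "status": "False"}, 1),
    ({"cluster_id": "aws01", "type": "Ready", "status": "False"}, 1),
]
cluster_info = [
    ({"cluster_id": "vs01",
      "infrastructure_reference_kind": "VSphereCluster"}, 1),
    ({"cluster_id": "aws01", "provider": "capa",
      "infrastructure_reference_kind": "AWSCluster"}, 1),
]

relabeled = label_replace(cluster_info, "provider", "vsphere",
                          "infrastructure_reference_kind", "VSphereCluster")
result = multiply_group_left(kubeadmconfig, relabeled,
                             "cluster_id", "provider")
providers = {labels["cluster_id"]: labels["provider"] for labels, _ in result}
print(providers)  # both clusters keep alerting, each with its own provider
```

The regex only matches `VSphereCluster`, so the CAPA cluster's series passes through `label_replace` untouched and keeps its own `provider` label through the join: the alert still fires for every provider, just with correct routing.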

Member

That makes more sense 😄 Thanks! I guess the same would be needed for cloud-director?

Contributor

We do not need it for cloud-director because we do not have Cloud Director WCs running on cloud MCs yet.

We might need this for Proxmox and so on, I guess.

Contributor

But I would prefer that we try to find a better solution in the long run :)

Member

That sounds good. Thank y'all for the explanations!
